This document presents the results obtained for Colorado.
The 3858 tweets sent during the disaster were filtered by the nine counties that, according to the reports, were affected: Arapahoe, Boulder, Denver, El Paso, Jefferson, Larimer, Logan, Morgan and Weld.
Only 2576 tweets, those sent during the “Flood” stage from these counties, were retained for content analysis.
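The two filtering steps can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the code used for the analysis: the record layout (`county`, `stage`, `text`) and the sample tweets are hypothetical, and only the county list and the “Flood” stage come from the text above.

```python
# Counties named in the reports as affected by the disaster.
AFFECTED_COUNTIES = {
    "Arapahoe", "Boulder", "Denver", "El Paso", "Jefferson",
    "Larimer", "Logan", "Morgan", "Weld",
}


def filter_tweets(tweets, counties=AFFECTED_COUNTIES, stage=None):
    """Keep tweets from the given counties, optionally from one stage only."""
    kept = [t for t in tweets if t["county"] in counties]
    if stage is not None:
        kept = [t for t in kept if t["stage"] == stage]
    return kept


# Hypothetical records: the real dataset's field names are not shown in the report.
sample = [
    {"county": "Boulder", "stage": "flood", "text": "Creek is rising fast"},
    {"county": "Weld", "stage": "postflood", "text": "Cleanup starts today"},
    {"county": "Pueblo", "stage": "flood", "text": "Heavy rain here too"},
]

in_counties = filter_tweets(sample)               # county filter only
flood_only = filter_tweets(sample, stage="flood")  # county and stage filter
print(len(in_counties), len(flood_only))
```

Keeping the stage filter optional makes it possible to reuse the same function for the whole-dataset counts and for the stage-specific subset.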
A quick view of the most common words in the whole dataset:
## # A tibble: 5,601 x 2
##    word             n
##    <chr>        <int>
##  1 boulder       1811
##  2 boulderflood   428
##  3 coflood        321
##  4 colorado       274
##  5 cowx           270
##  6 flood          155
##  7 rain           140
##  8 creek          121
##  9 amp            118
## 10 denver         107
## # … with 5,591 more rows
Again, because “boulder” is by far the most common word and would dominate the topic modelling, it was removed from the dataset. The next four terms (“boulderflood”, “coflood”, “colorado”, “cowx”) were also excluded because they are very common and used neutrally in all four stages. The token “amp”, which no longer appears in the list below, was dropped as well. After excluding these six terms, the list of most common words is:
## # A tibble: 5,592 x 2
##    word         n
##    <chr>    <int>
##  1 flood      155
##  2 rain       140
##  3 creek      121
##  4 denver     107
##  5 flooding   106
##  6 day         87
##  7 water       71
##  8 people      70
##  9 time        68
## 10 park        67
## # … with 5,582 more rows
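The counting and exclusion step can be sketched as below. This is a rough Python illustration, not the original pipeline: the regex tokenizer and the sample tweets are assumptions, and `EXCLUDED` simply lists the six terms that appear in the first list above but not in the second.

```python
import re
from collections import Counter

# Terms present in the first top-10 list but absent from the second one.
EXCLUDED = {"boulder", "boulderflood", "coflood", "colorado", "cowx", "amp"}


def count_words(tweets, excluded=frozenset()):
    """Count lower-cased alphanumeric tokens, skipping excluded terms."""
    counts = Counter()
    for text in tweets:
        # Assumed tokenizer: runs of letters and digits; '#' and punctuation are dropped.
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        counts.update(tok for tok in tokens if tok not in excluded)
    return counts


# Hypothetical tweets, for illustration only.
sample = [
    "Boulder creek is flooding #boulderflood",
    "Rain all day in Denver #cowx",
    "Boulder rain again, creek rising",
]

before = count_words(sample)
after = count_words(sample, EXCLUDED)
print(before.most_common(5))
print(after.most_common(5))
```

Removing the terms at counting time, rather than deleting them from the tweets, leaves the raw text untouched for any later step that needs it.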
The tf-idf statistic was computed to identify which words are important in each flood stage. In general, tf-idf measures how important a word is to a document within a collection (or corpus) of documents. Here, the tweets belonging to each stage were pooled and treated as one document, so the statistic measures how characteristic a word is of one stage compared with the other stages. In the word column of the table below, each term carries its stage as a suffix after “___”.
## # A tibble: 566 x 6
##    stage              word                               n tf      idf   tf_idf
##    <fct>              <fct>                          <int> <dbl>   <dbl> <dbl>
##  1 flood              sirens___flood                    20 0.00426 1.39  0.00591
##  2 postflood          fitsocial___postflood             11 0.00421 1.39  0.00584
##  3 flood              flood___flood                     95 0.0202  0.288 0.00582
##  4 flood              stapleton___flood                 15 0.00320 1.39  0.00443
##  5 immediate_afterma… cofloodrelief___immediate_aft…    12 0.00299 1.39  0.00415
##  6 flood              flash___flood                     28 0.00597 0.693 0.00414
##  7 immediate_afterma… weld___immediate_aftermath        11 0.00274 1.39  0.00380
##  8 immediate_afterma… relief___immediate_aftermath      21 0.00524 0.693 0.00363
##  9 flood              flashflood___flood                12 0.00256 1.39  0.00355
## 10 immediate_afterma… flood___immediate_aftermath       48 0.0120  0.288 0.00344
## # … with 556 more rows
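The computation behind the `tf`, `idf` and `tf_idf` columns can be sketched as follows. This is a minimal Python illustration rather than the code that produced the table. It assumes that each stage is treated as one document, that tf is a word's count divided by the total word count of its stage, and that idf is the natural log of the number of stages divided by the number of stages containing the word. The stage names `s1` to `s4` and all counts are hypothetical.

```python
import math
from collections import Counter


def tf_idf(stage_counts):
    """Return (stage, word, n, tf, idf, tf_idf) rows, highest tf_idf first.

    stage_counts maps a stage name to a Counter of word counts for that stage.
    """
    n_stages = len(stage_counts)
    # Number of stages in which each word occurs at least once.
    doc_freq = Counter()
    for counts in stage_counts.values():
        doc_freq.update(counts.keys())

    rows = []
    for stage, counts in stage_counts.items():
        total = sum(counts.values())
        for word, n in counts.items():
            tf = n / total                                # share of the stage's words
            idf = math.log(n_stages / doc_freq[word])     # natural log, as assumed above
            rows.append((stage, word, n, tf, idf, tf * idf))
    return sorted(rows, key=lambda r: r[-1], reverse=True)


# Hypothetical counts for four stages, for illustration only.
toy = {
    "s1": Counter({"sirens": 2, "flood": 6, "rain": 2}),
    "s2": Counter({"flood": 5, "relief": 3, "rain": 2}),
    "s3": Counter({"flood": 4, "relief": 4}),
    "s4": Counter({"rain": 3, "park": 2}),
}

rows = tf_idf(toy)
for stage, word, n, tf, idf, score in rows[:5]:
    print(f"{stage:3} {word:8} {n:3d} {tf:.4f} {idf:.3f} {score:.4f}")
```

Under these assumptions, a word found in one, two or three of four stages gets an idf of ln(4/1) ≈ 1.39, ln(4/2) ≈ 0.693 or ln(4/3) ≈ 0.288, which matches the three idf values that appear in the table above. A word present in all four stages gets an idf of 0 and therefore a tf-idf of 0.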